Abstract

The last few years have seen remarkably fast progress in the under- 
standing of statistics and epidemic dynamics of various clustered networks. 
This paper considers a class of networks based around a new concept (the 
locale) that allow exact results to be derived for epidemic dynamics. While 
there is no restriction on the motifs that can be found in such graphs, each 
node must be uniquely assigned to a generally clustered subgraph in this 
construction. 



1 Introduction 

Recent progress on exact analytic approaches to epidemics on clustered net- 
works has been extremely fast. Models have been proposed based on house- 
holds [T7J 131 H] , and the more general concept of local-global networks [TJ [5] . 
Another recent innovation has come from generalisations of random graph the- 
ory [TJI Uni IH], and at the same time, general methods have been proposed for 
manipulation of master equations [16] [15]. These complement the traditional
epidemiological approach to clustering based on moment closure [9] that has
recently been applied to graphs with more general motif structure [7].

This paper draws on much of this recent activity, making three main con- 
tributions. Firstly, a set of networks is defined using the new concept of a 
locale (which is distinct from the recently introduced concept of a role [5]) that 
have no restriction on the motifs that can be present. Secondly, exact epidemic 
dynamics are derived for these networks — the first time that manifestly exact 
results for transient epidemic dynamics of an infinite clustered network with 
non-homogeneous mixing outside the clusters have been derived. Finally, tech- 
niques are presented for practical efficient calculation of quantities of interest. 

2 General Theory



2.1 Network generation 

We start with the definition of a network (or graph; we use the terms
interchangeably) $G$ of size $N$ as a set of nodes (vertices) $V = \mathbb{Z}_N$,
which are indexed by $i, j, \ldots \in \mathbb{Z}_N$, and a set of links (edges)
$E \subseteq V \times V$. The information contained in a network can be encoded
in an adjacency matrix $A = (A_{ij})$, whose elements are given by

$$A_{ij} = \begin{cases} 1 & \text{if } (i,j) \in E , \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$

Here we consider symmetric, non-weighted networks without self-links and so

$$A_{ii} = 0 , \qquad A_{ij} = A_{ji} .$$

We now present a model for network creation that is both more general than 
previous work, and also allows significant analytic progress to be made. This 
starts by defining a set of objects we call stubby subnets, which are indexed by
type $\sigma$. A stubby subnet of type $\sigma$ and size $n^\sigma$ consists of
three elements:

1. A set of nodes $v^\sigma = \mathbb{Z}_{n^\sigma}$;

2. A set of within-subnet links, $e^\sigma \subseteq v^\sigma \times v^\sigma$,
with a within-subnet adjacency matrix $a^\sigma$ defined as for $A$ above;

3. A vector of 'stubs' $s^\sigma$, such that $\forall i \in v^\sigma$,
$s^\sigma_i \in \mathbb{Z}$.

A full network is then constructed in the following way. Firstly, we take a
number $M^\sigma \geq 1$ of each stubby subnet type, such that the network size
and nodes are given respectively by

$$N = \sum_\sigma M^\sigma n^\sigma , \qquad V = \bigoplus_\sigma \bigoplus_{m=1}^{M^\sigma} v^\sigma . \qquad (2)$$

Here we use tensor sums $\oplus$ to represent the aggregation of subnet nodes
without the removal of 'duplicates' that would be implicit in set-theoretic
union. We can also apply this concept to the within-subnet links, providing one
part of the full link set,

$$E_1 = \bigoplus_\sigma \bigoplus_{m=1}^{M^\sigma} e^\sigma . \qquad (3)$$

The remainder of links are then provided by constructing a full vector of
'stubs' and connecting these using the standard Configuration Model [11]:

$$S = \bigoplus_\sigma \bigoplus_{m=1}^{M^\sigma} s^\sigma , \qquad E_2 = \mathrm{ConfigurationModel}(V, S) , \qquad E = E_1 \cup E_2 . \qquad (4)$$

In the limit where the network is sufficiently large, no duplicate links will be
produced through the union of $E_1$ and $E_2$; for explicit generation of
finite-size networks, however, the removal of duplicates implicit in (4) is
commonly used.

Having defined such a network, it is straightforward to calculate degree
distributions and clustering coefficients, since a node $i$ from a stubby subnet
of type $\sigma$ has degree and clustering coefficient

$$d_i = s^\sigma_i + \sum_j a^\sigma_{ij} , \quad \text{and} \quad \phi_i = \frac{[(a^\sigma)^3]_{ii}}{d_i (d_i - 1)} . \qquad (5)$$

From consideration of the standard configuration model, a giant component
emerges within a network of this kind provided

$$\sum_\sigma M^\sigma D^\sigma (D^\sigma - 2) > 0 , \quad \text{where} \quad D^\sigma := \sum_{i=1}^{n^\sigma} s^\sigma_i . \qquad (6)$$

Note that we have implicitly assumed that all stubby subnets are internally
connected.
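
The construction above can be sketched in a few lines of Python. This is a
minimal illustration rather than the code used for the results in this paper:
the function names `build_network` and `giant_component_possible`, and the
representation of a subnet type as a tuple `(n, edges, stubs)`, are assumptions
made here. Self-links and duplicate links produced by the stub pairing are
simply dropped, and an unpaired final stub is discarded.

```python
import random


def build_network(subnet_types, counts, seed=0):
    """Assemble a network from stubby subnets.

    subnet_types: list of (n, edges, stubs), with edges given as (i, j) pairs
                  on nodes 0..n-1 and stubs[i] the number of stubs on node i.
    counts:       number of copies M of each subnet type.
    Returns (N, links), where links is a set of frozenset node pairs.
    """
    rng = random.Random(seed)
    links = set()   # within-subnet links first, global links added below
    stubs = []      # one entry per stub, holding the id of the owning node
    offset = 0
    for (n, edges, s), m in zip(subnet_types, counts):
        for _ in range(m):
            for i, j in edges:
                links.add(frozenset((offset + i, offset + j)))
            for i in range(n):
                stubs.extend([offset + i] * s[i])
            offset += n
    # Configuration-model step: shuffle the stubs and pair them off.
    rng.shuffle(stubs)
    for u, v in zip(stubs[0::2], stubs[1::2]):
        if u != v:                        # drop self-links
            links.add(frozenset((u, v)))  # the set drops duplicate links
    return offset, links


def giant_component_possible(subnet_types, counts):
    """Evaluate the criterion sum_sigma M D (D - 2) > 0 of equation (6)."""
    total = 0
    for (_, _, s), m in zip(subnet_types, counts):
        d = sum(s)
        total += m * d * (d - 2)
    return total > 0


# Example: 100 triangles in which every node carries one stub.
triangle = (3, [(0, 1), (1, 2), (0, 2)], [1, 1, 1])
N, links = build_network([triangle], [100])
```

For the triangle example each subnet has three stubs in total, so the criterion
in (6) is satisfied; a subnet type consisting of a single linked pair with one
stub per node would sit exactly on the boundary.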



2.2 Invasion and final size 

We now introduce a framework for the determination of whether a network of 
the kind considered can support the invasion of a species obeying SIR dynamics. 
To do this, we define the concept of a locale, which is a stubby subnet of type
$\sigma$, together with an 'origin' node $o \in v^\sigma$. Clearly, there are at
least $n^\sigma$ such locales to consider, although symmetries may reduce the
effective number of these. Locale types are denoted using indices like
$\Lambda = (\sigma, o)$.

Invasibility of a network of the type under consideration (i.e. one constructed 
from stubby subnets) can therefore be considered by constructing a branching 
process on locales. If we define a 'locale next generation' matrix as the number 
of secondary locales infected by an initially infected locale early in the epidemic, 
then we can use the dominant eigenvalue of such a matrix to define a threshold 
parameter. 

In order to do this, we need to define two dynamical quantities. The first
of these is $T$, the probability that infection eventually passes across a
network link where one node starts infectious and the other susceptible. The
second is $P^\sigma(j|o)$, the probability that, within the locale $(\sigma, o)$
where infection is first introduced to node $o$, infection eventually reaches
node $j \in v^\sigma$. The calculation of these two quantities depends on the
precise dynamical system underneath the transmission process, but once they have
been determined, the locale next generation matrix (interpreted as the expected
number of locales of type $\Lambda' = (\sigma', o')$ created by a locale of type
$\Lambda = (\sigma, o)$ early in the epidemic) is given by

$$\mathcal{K}^L_{\Lambda\Lambda'} = T \, \frac{M^{\sigma'} s^{\sigma'}_{o'}}{D} \Big( (s^\sigma_o - 1) + \sum_{j \neq o} P^\sigma(j|o) \, s^\sigma_j \Big) , \qquad (7)$$

where the total number of stubs in the network is

$$D := \sum_\sigma M^\sigma D^\sigma . \qquad (8)$$

The locale basic reproduction number, which is different from the standard basic
reproductive number $R_0$, is then the dominant eigenvalue of this matrix,

$$R_L := \| \mathcal{K}^L \| . \qquad (9)$$
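
As a hedged illustration of how a locale next generation matrix of this kind
might be assembled numerically, the sketch below takes $T$, the stub vectors and
the within-locale outcome probabilities $P(j|o)$ as given inputs. It assumes the
reading of the matrix in which a stub from an infected node reaches a locale
with probability proportional to the number of stubs at that locale's origin.
The function names and the dictionary layout are hypothetical, and only locales
whose origin has at least one stub are included, since no others can be entered
through a global link.

```python
def locale_ngm(T, subnets):
    """Build a locale next generation matrix.

    subnets: list of dicts with keys
        "M"     - number of copies of this subnet type,
        "stubs" - list of stub counts per node,
        "P"     - matrix with P[o][j] the probability that infection
                  introduced at node o eventually reaches node j.
    Returns (locales, K) where locales is a list of (type, origin) pairs.
    """
    D = sum(sn["M"] * sum(sn["stubs"]) for sn in subnets)  # total stubs
    locales = [(a, o) for a, sn in enumerate(subnets)
               for o, s in enumerate(sn["stubs"]) if s > 0]
    K = []
    for a, o in locales:
        s, P = subnets[a]["stubs"], subnets[a]["P"]
        # Expected number of stubs available for onward global transmission.
        onward = (s[o] - 1) + sum(P[o][j] * s[j]
                                  for j in range(len(s)) if j != o)
        K.append([T * subnets[b]["M"] * subnets[b]["stubs"][q] / D * onward
                  for b, q in locales])
    return locales, K


def dominant_eigenvalue(K, iterations=200):
    """Power iteration for a matrix with non-negative entries."""
    x = [1.0] * len(K)
    lam = 0.0
    for _ in range(iterations):
        y = [sum(k * v for k, v in zip(row, x)) for row in K]
        lam = max(abs(v) for v in y)
        if lam == 0.0:
            return 0.0
        x = [v / lam for v in y]
    return lam


# Single-node subnets with four stubs: the threshold reduces to T (k - 1).
regular = [{"M": 1, "stubs": [4], "P": [[1.0]]}]
R_regular = dominant_eigenvalue(locale_ngm(0.5, regular)[1])

# Triangles with one stub per node and an assumed P(j|o) = 0.5 for j != o.
p = 0.5
tri = [{"M": 1, "stubs": [1, 1, 1],
        "P": [[1.0, p, p], [p, 1.0, p], [p, p, 1.0]]}]
R_triangle = dominant_eigenvalue(locale_ngm(0.5, tri)[1])
```

In the single-node case the result is $T(k-1)$, the familiar configuration-model
threshold; for the triangle example the value is $2pT$, since the origin has no
spare stub and each of the other two nodes contributes one stub with
probability $p$.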



By using a 'susceptibility sets' argument as in J3J H] , the final size of an epidemic 
can also be calculated using the following set of transcendental equations: 

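$$R_\infty = 1 - \frac{\sum_\sigma M^\sigma \sum_{i \in v^\sigma} x^\sigma_i}{\sum_\sigma M^\sigma n^\sigma} ,$$

$$x^\sigma_i = \prod_{j \in v^\sigma} \Big( \big(1 - P^\sigma(i|j)\big) + P^\sigma(i|j) \, \pi^\sigma_j \Big) ,$$

$$\pi^\sigma_i = \Big( (1 - T) + T \, \frac{\sum_{\sigma'} M^{\sigma'} \sum_{j \in v^{\sigma'}} s^{\sigma'}_j \tilde{x}^{\sigma'}_j}{\sum_{\sigma'} M^{\sigma'} \sum_{j \in v^{\sigma'}} s^{\sigma'}_j} \Big)^{s^\sigma_i} , \qquad (10)$$

$$\tilde{\pi}^\sigma_i = \Big( (1 - T) + T \, \frac{\sum_{\sigma'} M^{\sigma'} \sum_{j \in v^{\sigma'}} s^{\sigma'}_j \tilde{x}^{\sigma'}_j}{\sum_{\sigma'} M^{\sigma'} \sum_{j \in v^{\sigma'}} s^{\sigma'}_j} \Big)^{s^\sigma_i - 1} .$$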

Here $R_\infty$ is the proportion of the population that is ultimately infected
by the epidemic, $x^\sigma_i$ is the probability that the $i$-th node in a
stubby subnet $\sigma$ avoids infection during the epidemic and $\pi^\sigma_i$
is the corresponding probability for avoidance of global infection. Variables
marked with a tilde represent secondary locales in the susceptibility-set
branching process, and other quantities are as defined above.
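
To make the structure of such a final-size calculation concrete, the following
sketch treats only the simplest special case, in which every stubby subnet is a
single node with $k$ stubs, so that the network is a $k$-regular configuration
model. In that case the standard percolation argument gives a single
transcendental equation, $\theta = (1 - T) + T \theta^{k-1}$, for the
probability $\theta$ that a given stub fails to deliver infection, with
$R_\infty = 1 - \theta^k$. The function name and the fixed-point iteration are
choices made here for illustration.

```python
def final_size_regular_cm(T, k, tol=1e-12, max_iter=100000):
    """Final size for a k-regular configuration model (single-node subnets).

    Iterates theta -> (1 - T) + T * theta**(k - 1) starting from theta = 0,
    which converges upwards to the smallest fixed point in [0, 1].
    """
    theta = 0.0
    for _ in range(max_iter):
        new_theta = (1.0 - T) + T * theta ** (k - 1)
        if abs(new_theta - theta) < tol:
            theta = new_theta
            break
        theta = new_theta
    return 1.0 - theta ** k


# Degree 3 with T = 2/3: theta = (1 - T) / T = 0.5, so R_inf = 0.875.
size_k3 = final_size_regular_cm(2.0 / 3.0, 3)

# Degree 4 with T = 0.5: theta is the golden-ratio conjugate (sqrt(5) - 1) / 2.
size_k4 = final_size_regular_cm(0.5, 4)

# Below the threshold T (k - 1) = 1 the only fixed point is theta = 1.
size_subcritical = final_size_regular_cm(0.2, 4)
```

Starting the iteration from zero rather than one avoids the trivial root
$\theta = 1$, which is always a solution and corresponds to no epidemic.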



2.3 Full Dynamics 

In order to consider full transient dynamics for the system, we assume that
transmission of infection across a link is a one-step Poisson process, happening
at rate $\tau$, and that recovery is Markovian with rate $\gamma$. Our
methodology is straightforwardly extended to the case where shedding happens at
a variable rate during an individual's infectious period or the case of
non-exponentially distributed recovery times through the method of stages (and
other compartmental methods). In the Markovian case, $T = \tau/(\tau + \gamma)$,
but to calculate $P(i|j)$ we must consider internal dynamics for a subnet of
size $n$ with adjacency matrix $a$ and infection starting on node $o$. Since the
general dynamics in this case are rather hard to write down, we make use of
Dirac notation, using the appropriate links to Markov chains [6], to simplify
notation.
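
For completeness, the Markovian expression for $T$ can be obtained in one line
by conditioning on the length $t$ of the infectious period. The derivation
below is a standard calculation, written out under the assumptions just stated
(exponentially distributed infectious period with rate $\gamma$, transmission
across a single link as a Poisson process with rate $\tau$).

```latex
% Probability that at least one transmission event occurs across the link
% before the infectious node recovers:
T = \int_0^\infty \gamma e^{-\gamma t} \left( 1 - e^{-\tau t} \right) dt
  = 1 - \frac{\gamma}{\gamma + \tau}
  = \frac{\tau}{\tau + \gamma} .
```

In the natural units $\gamma = 1$, a transmission rate of $\tau = 2$ therefore
corresponds to $T = 2/3$.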

2.3.1 Within-subnet dynamics 

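Our starting point is a node-level state space

$$\mathcal{S} = \{ |S\rangle , |I\rangle , |R\rangle \} , \qquad (11)$$

defined such that, where we use letters $A, B, \ldots$ to represent generic
states,

$$\langle A | B \rangle = \delta_{A,B} . \qquad (12)$$

We then define five abstract operators: three that return the appropriate
infection state

$$\hat{S}|S\rangle = |S\rangle , \quad \hat{S}|I\rangle = 0 , \quad \hat{S}|R\rangle = 0 ,$$
$$\hat{I}|S\rangle = 0 , \quad \hat{I}|I\rangle = |I\rangle , \quad \hat{I}|R\rangle = 0 ,$$
$$\hat{R}|S\rangle = 0 , \quad \hat{R}|I\rangle = 0 , \quad \hat{R}|R\rangle = |R\rangle ; \qquad (13)$$

and two that correspond to transmission and recovery

$$\hat{t}|S\rangle = |I\rangle , \quad \hat{t}|I\rangle = 0 , \quad \hat{t}|R\rangle = 0 ,$$
$$\hat{r}|S\rangle = 0 , \quad \hat{r}|I\rangle = |R\rangle , \quad \hat{r}|R\rangle = 0 . \qquad (14)$$

So a general state under consideration obeys

$$|p\rangle \in \mathcal{S}^{\otimes n} , \quad \langle 1 | p \rangle = 1 , \quad \text{where} \quad |1\rangle := \big( |S\rangle + |I\rangle + |R\rangle \big)^{\otimes n} . \qquad (15)$$

This is in contrast to normalisation in quantum mechanics, where states obey
$\langle \psi | \psi \rangle = 1$, and the 'ket' $|1\rangle$ is henceforth used
without explicit definition to stand for an unweighted sum over basis states.
Where $\hat{O}$ is an operator defined to act on elements of $\mathcal{S}$, we
define an operator acting on the complete state space using subscripting so that

$$\hat{O}_i := 1 \otimes \cdots \otimes 1 \otimes \underbrace{\hat{O}}_{i\text{th place}} \otimes 1 \otimes \cdots \otimes 1 . \qquad (16)$$

Having set up this machinery, we can now write the system's dynamics in an
extremely compact form:

$$\frac{d}{dt} |p\rangle = Q |p\rangle , \quad \text{where} \quad Q = \tau \sum_i (\hat{t}_i - \hat{S}_i) \sum_j a_{ij} \hat{I}_j + \gamma \sum_i (\hat{r}_i - \hat{I}_i) . \qquad (17)$$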

Despite this compact expression, the actual dimensionality of the system above 
grows extremely quickly with network size for numerical and analytical work. 
There are two general methods available for increasing the tractability of these 
equations, particularly for final outcomes. 
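
To illustrate both the construction and the growth in dimensionality, the
sketch below builds a matrix representation of the generator of the
within-subnet SIR Markov chain for a small graph by enumerating all $3^n$
states directly. It is an assumption-laden illustration: the function name
`generator`, the column convention `Q[new][old]` and the pure-Python nested
lists are choices made here, not the representation used elsewhere in this
paper.

```python
from itertools import product


def generator(adj, tau, gamma):
    """Return (states, Q) for SIR dynamics on a graph with adjacency adj.

    Q[new][old] is the rate of moving from state old to state new, and each
    diagonal entry is minus the total outflow, so every column sums to zero.
    """
    n = len(adj)
    states = list(product("SIR", repeat=n))      # all 3**n configurations
    index = {s: k for k, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for s in states:
        col = index[s]
        for i in range(n):
            if s[i] == "S":
                # Infection: rate tau per infectious neighbour of node i.
                rate = tau * sum(adj[i][j] for j in range(n) if s[j] == "I")
                new = s[:i] + ("I",) + s[i + 1:]
            elif s[i] == "I":
                rate = gamma                     # recovery of node i
                new = s[:i] + ("R",) + s[i + 1:]
            else:
                continue
            if rate > 0:
                Q[index[new]][col] += rate
                Q[col][col] -= rate
    return states, Q


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
states, Q = generator(triangle, tau=2.0, gamma=1.0)
```

Even for the triangle the matrix is $27 \times 27$; a subnet of ten nodes would
already need $3^{10} = 59049$ states, which is the growth referred to above.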

Path integrals for Markov chains 

The outcome probabilities for local subnets can be written in terms of the
following integral

$$P(j|o) = \int_0^\infty \langle p_j | e^{Qt} | o \rangle \, dt , \qquad (18)$$

where we have defined two new states

$$|o\rangle := \hat{I}_o \prod_{i \neq o} \hat{S}_i |1\rangle , \qquad |p_j\rangle := \gamma \hat{I}_j |1\rangle . \qquad (19)$$

In order to evaluate (18) efficiently, we can make use of the general theory of
path integrals for Markov chains [14]. To do this, we need first to decompose
the state space into an absorbing set $\mathcal{A}$ and a non-absorbing set
$\mathcal{C}$:

$$\mathcal{S}^{\otimes n} = \mathcal{A} \sqcup \mathcal{C} , \qquad (20)$$

which can be done through the definition of projection operators

$$P_{\mathcal{A}} = \sum_{\{A_i\}_{i=1}^n \in \{S, R\}^{\otimes n}} |A_1\rangle \otimes \cdots \otimes |A_n\rangle \langle A_1| \otimes \cdots \otimes \langle A_n| , \qquad P_{\mathcal{C}} = 1 - P_{\mathcal{A}} . \qquad (21)$$

Two further definitions are needed. Firstly, the time evolution operator
restricted to the non-absorbing states is given by

$$Q_{\mathcal{C}} := Q \circ P_{\mathcal{C}} . \qquad (22)$$

Secondly, in contrast with quantum mechanics, operators are not Hermitian,
and so 'transposed' operators that act on the adjoint space of 'bra' states are
denoted using the dagger $\dagger$ and are not identical to the un-daggered
operators on 'ket' states. Using these definitions, it is possible to write
final outcome probabilities for the epidemic process in a particularly compact
form:

$$P(j|o) = \langle p_j | \big( (Q_{\mathcal{C}})^\dagger \big)^{-1} | o \rangle . \qquad (23)$$

This method of path integrals was applied to household epidemic models in [15].
In practice, the inverse operator in (23) need not be calculated in full: for
SIR dynamics, a matrix representation will exist in which $Q$ is triangular, and
so quantities of interest can be calculated by solving a system of triangular
linear equations, which is relatively numerically efficient.
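
The triangular structure can be exploited without forming any matrix at all.
Because SIR transitions only ever move nodes from S to I to R, the state graph
has no cycles, and final outcome probabilities follow from a memoised recursion
over the embedded jump chain, which plays the role of back-substitution in the
triangular system. The sketch below takes this route rather than evaluating the
operator inverse directly; the function name `outcome_probabilities` is
hypothetical, and exact rational arithmetic is used purely so that results can
be compared with closed forms.

```python
from fractions import Fraction
from functools import lru_cache


def outcome_probabilities(adj, tau, gamma, origin):
    """Return a tuple whose j-th entry is P(j | origin).

    Entry j is the probability that node j is eventually infected when
    infection is introduced at node `origin` in an otherwise susceptible
    subnet with adjacency matrix adj.
    """
    n = len(adj)
    tau, gamma = Fraction(tau), Fraction(gamma)

    @lru_cache(maxsize=None)
    def ever_infected(state):
        events = []
        for i in range(n):
            if state[i] == "S":
                rate = tau * sum(adj[i][j] for j in range(n)
                                 if state[j] == "I")
                new = "I"
            elif state[i] == "I":
                rate, new = gamma, "R"
            else:
                continue
            if rate > 0:
                events.append((rate, state[:i] + (new,) + state[i + 1:]))
        if not events:
            # Absorbing state: every non-susceptible node was infected.
            return tuple(Fraction(int(c != "S")) for c in state)
        total = sum(rate for rate, _ in events)
        result = [Fraction(0)] * n
        for rate, nxt in events:
            later = ever_infected(nxt)
            for j in range(n):
                result[j] += rate / total * later[j]
        return tuple(result)

    start = tuple("I" if i == origin else "S" for i in range(n))
    return ever_infected(start)


pair = [[0, 1], [1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
P_pair = outcome_probabilities(pair, 2, 1, 0)
P_path = outcome_probabilities(path, 2, 1, 0)
P_triangle = outcome_probabilities(triangle, 2, 1, 0)
```

For a single link the recursion returns $T = \tau/(\tau+\gamma)$, and for the
far end of a three-node path it returns $T^2$, as expected; the triangle gives a
larger value than $T$ for each non-origin node because of the second route of
infection.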

Automorphism-driven lumping 

Recently, the technique of automorphism-driven lumping has been applied to
epidemic dynamics on networks [16] and percolation [8]. This approach reduces
the complexity of network problems by making systematic use of discrete
symmetries of the network. In particular, the automorphism group of a graph
$G$ of size $n$ with adjacency matrix $a$ is a subset of the permutation group:
$\mathrm{Aut}(G) \subseteq S_n$. The elements of the automorphism group leave
the adjacency matrix invariant:

$$M \in \mathrm{Aut}(G) \Leftrightarrow a = M a M^T . \qquad (24)$$

The use of this insight to lump epidemic equations requires some care in the
labelling of dynamical variables [16]. Using the notation above, we relabel a
generic dynamical state of the system

$$|A_1\rangle \otimes \cdots \otimes |A_n\rangle = |\{ (A_1, 1), \ldots, (A_n, n) \}\rangle , \qquad (25)$$

i.e. we go from an ordered set of states to an unordered set of pairs of states
and node numbers. 'Lumped' basis states for the dynamical system (17) can then
be defined according to the orbits of the automorphism group; this means that
states like the above are lumped together into classes like

$$L(A_1, \ldots, A_n) = \big\{ \{ (A_1, M(1)), \ldots, (A_n, M(n)) \} \mid M \in \mathrm{Aut}(G) \big\} , \qquad (26)$$

where $M(i)$ is the index of the non-zero component of the $i$-th row of the
permutation matrix $M$. The dynamical equivalence of these states can be seen by
repeated substitution of $a \to M a M^T$ into (17). Clearly, lumping classes
must contain states that all have the same eigenvalues of $\hat{S}$ and
$\hat{I}$; and in the limiting case of a fully connected graph such that
$\mathrm{Aut}(G) = S_n$, only these aggregate eigenvalues are required to
describe the system [16].
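
The reduction obtained from lumping can be checked directly for small graphs.
The following sketch finds the automorphism group by brute force over all
permutations, which is only feasible for a handful of nodes, and then groups
the $3^n$ node-level states into orbits. The function names are hypothetical
and no attempt is made at the efficiency of dedicated graph-automorphism
software.

```python
from itertools import permutations, product


def automorphisms(adj):
    """All node permutations p with adj[i][j] == adj[p[i]][p[j]]."""
    n = len(adj)
    return [p for p in permutations(range(n))
            if all(adj[i][j] == adj[p[i]][p[j]]
                   for i in range(n) for j in range(n))]


def lumped_classes(adj, labels="SIR"):
    """Partition node-level states into orbits of the automorphism group."""
    n = len(adj)
    group = automorphisms(adj)
    seen, classes = set(), []
    for state in product(labels, repeat=n):
        if state in seen:
            continue
        orbit = set()
        for p in group:
            image = [None] * n
            for i in range(n):
                image[p[i]] = state[i]   # the state of node i moves to p[i]
            orbit.add(tuple(image))
        seen |= orbit
        classes.append(sorted(orbit))
    return classes


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
n_triangle = len(lumped_classes(triangle))
n_path = len(lumped_classes(path))
```

For the triangle, whose automorphism group is the full permutation group on
three nodes, the 27 states collapse to 10 classes labelled only by the numbers
of S, I and R nodes. The three-node path has just one non-trivial symmetry and
retains 18 classes.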



2.3.2 Global dynamics 

Recently, a set of dynamics was presented that are a manifestly exact description
of the mean behaviour of an SIR epidemic on a configuration-model network [2]
(equivalent to a stubby subnet model where all subnets have one node). We now
re-write this in Dirac notation, so that this approach may be readily combined
with the within-subnet dynamics above to define exact global dynamics.

Our starting point is a set of states that represent a number of 'remaining
half-links'

$$\mathcal{S} = \{ |l\rangle \}_{l=0}^{k_{\max}} , \quad \text{such that} \quad \langle l' | l \rangle = \delta_{l,l'} , \qquad (27)$$

where $k_{\max}$ is the maximum node degree (or more generally maximum number
of stubs). We define two operators on such states: a link number operator, and
a link-number lowering operator:

$$\hat{l} |l\rangle = l |l\rangle , \qquad \hat{l}^- |l\rangle = \begin{cases} |l-1\rangle & \text{if } l > 0 , \\ 0 & \text{otherwise.} \end{cases} \qquad (28)$$

We now consider how remaining half-links interact with disease state. These
are taken as a tensor product,

$$|A, l\rangle = |A\rangle \otimes |l\rangle , \quad \text{so that} \quad \langle B, l' | A, l \rangle = \delta_{A,B} \, \delta_{l,l'} . \qquad (29)$$

By construction, however, recovered individuals lose all their half-links, so the
state space for this system is

$$\mathcal{S} = \{ |S, l\rangle , |I, l\rangle , |R, 0\rangle \}_{l=0}^{k_{\max}} . \qquad (30)$$

We then define four operators on this space, which we present in terms of their
non-trivial action

$$\hat{t} |S, l\rangle := (\hat{t}|S\rangle) \otimes |l\rangle = |I, l\rangle ,$$
$$\hat{b} |S, l\rangle := (\hat{t}|S\rangle) \otimes (\hat{l}^-|l\rangle) = |I, l-1\rangle ,$$
$$\hat{l}^- |A, l\rangle := |A\rangle \otimes (\hat{l}^-|l\rangle) = |A, l-1\rangle ,$$
$$\hat{r} |I, l\rangle := |R, 0\rangle . \qquad (31)$$

Three of these operators are simple uplifts, but the operator $\hat{b}$ for
global infection is new. To define the dynamics of this system, we start with a
general state

$$|p\rangle = \sum_l \big( x_l(t) |S, l\rangle + y_l(t) |I, l\rangle \big) + z(t) |R, 0\rangle , \qquad (32)$$

which obeys

$$\langle 1 | p \rangle = 1 , \quad \text{for} \quad |1\rangle := \sum_l \big( |S, l\rangle + |I, l\rangle \big) + |R, 0\rangle . \qquad (33)$$

There is also a non-linear term for the density of infection amongst free
half-links that appears in the system,

$$\rho[p] := \frac{\langle 1 | \hat{l} \hat{I} | p \rangle}{\langle 1 | \hat{l} | p \rangle} . \qquad (34)$$

Then an exact representation of expected SIR dynamics on a configuration-model
network is given by

$$Q[p] := \gamma (\hat{r} - \hat{I}) + \tau (\hat{l}^- - 1) \hat{l} \hat{I} + \rho[p] (\gamma + \tau) (\hat{l}^- - 1) \hat{l} + \rho[p] \tau (\hat{b} - \hat{S}) \hat{l} ,$$
$$\frac{d}{dt} |p\rangle = Q[p] |p\rangle . \qquad (35)$$

The significance of these dynamics is that they do not grow in size with network
size; in fact, they are exact in the infinite-size limit, which is inaccessible
through simulation or direct integration of (17).



2.3.3 Full system dynamics 

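For a network made up of stubby subnets, it is possible to make the same
construction as above, where global links are made along with the epidemic
process. In this case, a general state can be written

$$|p\rangle = \sum_{\sigma, \{A_i\}, \{l_i\}} p^{\sigma, A_1 \cdots A_{n^\sigma}}_{l_1 \cdots l_{n^\sigma}}(t) \, |\sigma\rangle \otimes |A_1, l_1\rangle \otimes \cdots \otimes |A_{n^\sigma}, l_{n^\sigma}\rangle , \qquad (36)$$

where $\langle \sigma' | \sigma \rangle = \delta_{\sigma,\sigma'}$ as would be
expected. Clearly, any attempt to write down differential equations for the
tensor representation of this system,
$p^{\sigma, A_1 \cdots A_n}_{l_1 \cdots l_n}(t)$, will involve extremely complex
expressions. By contrast, using the formalism of Dirac notation and operators
that we have developed above, we can write the exact dynamics for this system as

$$P_\sigma := \sum_{\{A_i\}, \{l_i\}} |\sigma\rangle \otimes |A_1, l_1\rangle \otimes \cdots \otimes |A_{n^\sigma}, l_{n^\sigma}\rangle \langle A_{n^\sigma}, l_{n^\sigma}| \otimes \cdots \otimes \langle A_1, l_1| \otimes \langle \sigma| ,$$

$$\rho[p] := \frac{\langle 1 | \sum_i \hat{l}_i \hat{I}_i | p \rangle}{\langle 1 | \sum_i \hat{l}_i | p \rangle} ,$$

$$Q[p] := \gamma \sum_i (\hat{r}_i - \hat{I}_i) + \tau \sum_i (\hat{l}^-_i - 1) \hat{l}_i \hat{I}_i + \tau \sum_i (\hat{t}_i - \hat{S}_i) \sum_{\sigma, j} a^\sigma_{ij} \hat{I}_j P_\sigma$$
$$\qquad + \rho[p] (\gamma + \tau) \sum_i (\hat{l}^-_i - 1) \hat{l}_i + \rho[p] \tau \sum_i (\hat{b}_i - \hat{S}_i) \hat{l}_i ,$$

$$\frac{d}{dt} |p\rangle = Q[p] |p\rangle . \qquad (37)$$

These equations have the same significance as above: the exact expected
epidemic dynamics of a class of clustered networks can be calculated for the
infinite-size limit of a network.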



3 Examples 

We now turn to some examples applying the methodology presented above to
specific networks. Throughout this section we work in natural units such that
the recovery rate $\gamma = 1$.



3.1 Invasion and final size 

We consider invasion on the two locales shown in Figure 1. These networks
are constructed from the envelope / diamond motif as shown, so that every
individual has exactly $n$ links. This means that all differences between this
model and an $n$-regular random graph derive from the presence and structure
of short loops in the network and not heterogeneity in node degree. The locale
basic reproductive ratio is given by:

$$R_L = \frac{\tau \left( \begin{aligned} & 2(n-3)^2 + (n(25n-142)+204)\tau + (n(133n-716)+982)\tau^2 \\ & + (n(377n-1948)+2570)\tau^3 + (n(563n-2846)+3672)\tau^4 \\ & + 2(n(193n-968)+1239)\tau^5 + 12(8(n-5)n+51)\tau^6 \end{aligned} \right)}{(2n-5)(1+\tau)^4 (1+2\tau)^2 (1+3\tau)} . \qquad (38)$$

Final sizes are calculated using (10). To compare the asymptotically exact
results (blue line) with finite-size networks, $10^6$ simulations were run for
envelope-based networks of size 100 and 1000, with $n = 4$, over a range of
transmission parameter values; the results are shown in Panes (c) and (d) of
Figure 1. For comparison,
theoretical curves were also plotted for a configuration model where every node
has four links (red line), and a household model for households of size 4 where 
every individual has one stub (green line). Final sizes for these two comparators 
are a special case of the analysis in [3J 0]. Clearly, all of these comparator 
networks also have the property that neighbourhood sizes are uniformly equal 
to four, and so all outcome differences are due to local clustering structure. 

The results shown in Panes (c) and (d) of Figure 1 show firstly that this
local clustering structure does have a significant impact on epidemic outcomes, 
and also that even for relatively small networks the results of simulation demon- 
strate this difference and agree well with the asymptotic result. The blue line 
representing final outcomes also has two interesting features: there is a short 
plateau of small but finite final sizes above the invasion threshold; and for very 
fast transmission, the predicted final sizes are larger than for the unclustered 
regular graph. 

3.2 Full Dynamics 

While invasion thresholds are of practical interest, transient dynamical features 
of epidemics are also important, and are not always simply determined by con- 
sideration of thresholds. Figure [2] shows the exact transient behaviour for two 
special graphs, both of which give all nodes degree three: (a) a configuration- 
model network where each node has 3 stubs; (b) a stubby-subnet graph com- 
posed of triangles with each node having one stub. The dynamics as defined 
above give the epidemic curves shown in (c) for the CM network and (d) for the 
triangle-based network respectively. 

These show the interesting feature that once we are in a region of much
faster transmission than is required for invasion, the clustered network exhibits
later but higher peaks, an analogue of the larger final sizes seen for clustered
networks at very large $\tau$ above.
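
For readers who wish to reproduce an epidemic curve for the unclustered
comparator, the sketch below integrates an edge-based compartmental
formulation of SIR dynamics on a $k$-regular configuration-model network. This
is a different, well-known formulation from the configuration-model
literature, substituted here because it reduces to a single ordinary
differential equation; it is not the operator form used in this paper. The
variable $\theta$ is the probability that a random neighbour has not yet
transmitted to a test node, the function name, step size and initial
perturbation `eps` are assumptions, and a hand-written fourth-order
Runge-Kutta step is used to keep the example self-contained.

```python
def ebcm_regular(k, tau, gamma, eps=1e-6, dt=0.01, t_max=60.0):
    """Integrate edge-based SIR equations for a k-regular configuration model.

    State variables: theta and the recovered fraction R; the susceptible
    fraction is theta**k and the infectious fraction is 1 - theta**k - R.
    Returns (theta, R, curve) with curve a list of (time, infectious fraction).
    """
    def deriv(theta, R):
        infectious = 1.0 - theta ** k - R
        dtheta = (-tau * theta + tau * theta ** (k - 1)
                  + gamma * (1.0 - theta))
        return dtheta, gamma * infectious

    theta, R, t = 1.0 - eps, 0.0, 0.0
    curve = []
    for _ in range(int(round(t_max / dt))):
        a1, b1 = deriv(theta, R)
        a2, b2 = deriv(theta + 0.5 * dt * a1, R + 0.5 * dt * b1)
        a3, b3 = deriv(theta + 0.5 * dt * a2, R + 0.5 * dt * b2)
        a4, b4 = deriv(theta + dt * a3, R + dt * b3)
        theta += dt / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        R += dt / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
        t += dt
        curve.append((t, 1.0 - theta ** k - R))
    return theta, R, curve


# Degree-3 configuration model with tau = 2 and gamma = 1.
theta_end, R_end, curve = ebcm_regular(3, 2.0, 1.0)
peak_time, peak_height = max(curve, key=lambda point: point[1])
```

At long times $\theta$ settles at the root of
$\theta = (1 - T) + T \theta^{k-1}$ with $T = \tau/(\tau+\gamma)$, so for this
parameter choice the final size approaches $1 - 0.5^3 = 0.875$, which gives a
simple consistency check between transient and final-size calculations.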

4 Other solvable networks 

It has been clear for some time that a network (or otherwise structured popula- 
tion) with a local-global distinction will admit a solution to an epidemic on that 
network pQ. As a practical adjunct to this, both the local and global features 
of the network must individually admit solution. The stubby-subnet networks 
here propose one such distinction: each node can be uniquely assigned to a local 
unit of clustered structure; and global mixing happens through a configuration 
model network. 

We now consider three other versions of this concept: firstly by introducing
assortative mixing outside the subnet, secondly by using the recently defined
role-based networks, and finally by extending to weighted networks.

4.1 Assortativity

In [13], a generalisation of the configuration model was developed to incorporate
the notion of assortativity. Such assortativity (or even disassortativity) is a
mainstay of epidemiology, and much theoretical effort has been expended to
model its effects [5]. To describe assortativity, we introduce a correlation
matrix $C_{\Lambda\Lambda'}$ (analogous to the $e_{ij}$ of [12]) that multiplies
the probabilities that two locales are linked globally compared to the
configuration model. For such a network, the locale next generation matrix is

$$\mathcal{K}^L_{\Lambda\Lambda'} = T \, [\ldots] \qquad (39)$$

and an appropriate threshold parameter will be given by the dominant eigenvalue
of this matrix. Exact transient dynamics for such a system should also
be straightforward to write down: in addition to indexing a node with its
effective remaining half-links and disease state, each node should also be
indexed by locale. Instead of having homogeneous transmission on the basis of
pairing half-links at rate $\tau$, the rate should then be multiplied by
$C_{\Lambda\Lambda'}$. Of course, this yields equations that are at least
quadratic rather than linear in maximum node degree, making numerical
integration correspondingly more difficult.

4.2 Role-based networks

Role-based networks as considered in [13] [10] [5] involve a different
definition of local and global. In these networks, it is links that can be
uniquely assigned to a local unit of clustered structure, meaning that nodes can
be attached to many different clustered subgraphs. This clearly allows a
next-generation matrix to be established by indexing cases by the unit of
structure through which they acquired infection, as in [10]. The definition of
manifestly exact dynamics is less clear in this case; however, dynamical
approaches such as [18] that are in extremely good numerical agreement with
simulation, and may turn out to be exact through further work, can clearly be
extended to role-based networks. The primary differences between stubby-subnet
and role-based networks are that the former can specify an exact structure of
stubs for each node in a clustered motif, while the latter can involve each node
in several motifs. As such, these are best seen as complementary approaches to
the fast-moving field of solvable clustered networks.

4.3 Weighted networks

While all networks discussed above have been topological (i.e. links are either
present or not), all of the analysis above carries through exactly if
within-subnet links are weighted, so $a^\sigma_{ij} \in \mathbb{R}$. It is also
possible to stratify global links into multiple contexts, each with a given
strength (i.e. different values of $T$), although this latter modification does
increase the system dimensionality, while weighting within-subnet dynamics does
this only if the weighting breaks a discrete symmetry of the topological
network.

Figure 1: Epidemics on envelope / diamond motif-based networks. (a) and (b)
show the two locales involved. Bottom panes show final sizes for $10^6$
simulations on networks of size (c) 100 and (d) 1000. Each translucent dot
represents a realisation; blue lines are asymptotic predictions for a regular
graph of degree 4; red lines are the asymptotic predictions for the envelope
network with $n = 4$; and green lines are asymptotic predictions for
four-cliques with one global link per node.

Figure 2: Exact transient epidemic dynamics for two special networks. (a) shows
a typical location in the unclustered graph, and (b) shows a typical location in
the clustered graph. Epidemic curves (grey) for different parameter values are
shown in (c), (d) respectively. Peak times (blue) and peak heights (red) are
projected onto the appropriate axes.